Handling within-writer variability and between-writer variation in the recognition

Author

  • Lambert Schomaker
Abstract

Although the performance of current algorithms for the automatic recognition of on-line handwriting has improved considerably in recent years, many problems remain in the actual application of these systems. The step from academic experiments to real-life use of such algorithms, e.g., in portable pen computers, still appears to be difficult. What is particularly intriguing is that reported academic (and commercial) recognition rates are usually overestimated by 10-20%. When such systems are put to the real test, i.e., use by any writer in a realistic application such as note taking during lectures, their performance drops sharply. One reason is that only limited training databases exist for on-line handwriting. A project is currently running to alleviate this problem: UNIPEN (Guyon & Schomaker, 1994). Although the availability of huge databases for system training and development potentially improves the performance of existing algorithms through wider coverage of handwriting shapes, it is very likely that many algorithms are not well suited to handle the case of an infinitely large training set. Approaches based on neural networks and approaches based on hidden Markov models both run the risk of satiation, where the system yields an average but incomplete representation of all possible handwriting shapes. Similarly, brute-force matching methods run the risk of becoming computationally impractical when all possible character shapes (allographs) have to be considered. This study is directed at the development of procedures for gaining insight into the underlying variation of shapes within large quantities of handwriting data from several writers. At this stage, it is useful to make a distinction between two sources of variation in handwriting shapes:


Similar resources

5 Conclusions

In this paper we have presented the NPen++ system, a neural recognizer for writer-dependent and writer-independent on-line cursive handwriting recognition. This system combines a robust input representation, which preserves the dynamic writing information, with a neural network integrating recognition and segmentation in a single framework. This architecture has been shown to be well suited fo...


Handwritten Character Recognition Adaptable to the Writer

In this paper, we describe handwritten character recognition that adapts to the writer. It is efficient when a specific writer uses the same OCR system for many characters. At the early stage, input characters are recognized using a general dictionary, and the correctly recognized characters then modify the dictionary so that it adapts to the variation in that writer's characters. Using...


The Use of Dynamic Writing Information in a Connectionist On-Line Cursive Handwriting Recognition System

In this paper we present NPen++, a connectionist system for writer-independent, large-vocabulary on-line cursive handwriting recognition. This system combines a robust input representation, which preserves the dynamic writing information, with a neural network architecture, a so-called Multi-State Time Delay Neural Network (MS-TDNN), which integrates recognition and segmentation in a single f...


Similarity Measures for Writer Clustering

Jayashree Subrahmonia, IBM T.J. Watson Research, P.O. Box 218 / Route 134, Yorktown Heights, NY 10598, U.S.A. This paper addresses the problem of improving the performance of an online, writer-independent, large-vocabulary, unconstrained handwriting recognition system by clustering writers with similar writing styles. Recognition performance is enhanced by identify...


Template-based Writer-independent Online Character Recognition System using Elastic Matching

A writer-independent handwriting recognition system must be able to recognize a wide variety of handwriting styles while maintaining a high degree of recognition accuracy. As the number of writing styles increases, so does the variability of the data distribution. We describe here a template-based system using a string matching distance measure of linear time complexity for the recognitio...




Publication date: 2012